Abstract

Incorporating uncertainties into calculations of signal significance for planned
experiments is a topical task. We present a procedure for taking into account
the effects of one-sided systematic errors, related to inexact knowledge of the
signal and background cross sections, on the discovery potential of an experiment.
A method for treating the statistical errors of the expected signal and
background rates is proposed. The interrelation between Gamma and Poisson
distributions is demonstrated.

1. Introduction 

One of the common goals of forthcoming experiments is the search for new phenomena. In estimating
the discovery potential of a planned experiment, the background cross section $\sigma_b$ (for example,
the Standard Model cross section) is calculated and, for the given integrated luminosity $L$, the average
number of background events is $n_b = \sigma_b \cdot L$. Suppose the existence of new physics leads to an additional
nonzero signal cross section $\sigma_s$ with the same signature as the background, which results
in the prediction of an additional average number of signal events $n_s = \sigma_s \cdot L$ for the integrated
luminosity $L$. The total average number of events is $\langle n \rangle = n_s + n_b = (\sigma_s + \sigma_b) \cdot L$. So, as a result
of the existence of new physics, we expect an excess in the average number of events. The probability of
observing $n$ events in the experiment is described by the Poisson distribution

$$f(n;\lambda) = \frac{\lambda^n}{n!} e^{-\lambda}. \qquad (1)$$



In this report we consider an approach to determining the "significance" of a predicted new-physics signal
relative to the predicted background. The approach is based on the analysis of the uncertainty
[||, 0] that will arise in future hypotheses testing about the existence of a new
phenomenon in Nature. We consider a simple statistical hypothesis $H_0$: new physics is present in Nature
(i.e. $\lambda = n_s + n_b$) against a simple alternative hypothesis $H_1$: new physics is absent ($\lambda = n_b$). The
value of the uncertainty is defined by the probability to reject the hypothesis $H_0$ when it is true
(Type I error $\alpha$) and the probability to accept the hypothesis $H_0$ when the hypothesis $H_1$ is true (Type II
error $\beta$). The concept of the "statistical significance" of an observation is reviewed in ref. [Q]. All
considerations in the paper are restricted to the simplest case of a one-channel counting experiment.
More advanced statistical analyses based on other techniques can be found, for example, in refs. [0].



2. "Signal significance" in planned experiment 

"Common practice is to express the significance of an enhancement by quoting the number of standard
deviations" [^. Let us define the "signal significance" (see, for example, ref. [||]) as the "effective
significance" $s$:

$$\sum_{n=n_0+1}^{\infty} f(n; n_b) = \frac{1}{\sqrt{2\pi}} \int_s^{\infty} \exp(-x^2/2)\,dx, \qquad (2)$$

where $n_0$ is the critical value for hypotheses testing (if the observed value $n \le n_0$ we reject $H_0$,
otherwise we accept $H_0$). In this case the system



$$\beta = \sum_{n=n_0+1}^{\infty} f(n; n_b) \le \Delta \qquad (3)$$

$$1 - \alpha = \sum_{n=n_0+1}^{\infty} f(n; n_s + n_b) \qquad (4)$$

allows us to construct the dependence of $n_s$ on $n_b$ for a given value of the Type II error $\beta \le \Delta$ (the
probability that the observed number of events in the planned experiment will be greater than the critical
value $n_0$ if hypothesis $H_1$ is true) and a given acceptance $1 - \alpha$ (the same probability if hypothesis $H_0$ is
true). If $\Delta = 2.85 \cdot 10^{-7}$ ($s \ge 5$, i.e. the value $n_0$ has a $5\sigma$ deviation from the average background $n_b$),
the corresponding acceptance can be called the probability of discovery and the dependence of $n_s$ on $n_b$ the
$5\sigma$ discovery curve; if $\Delta = 0.0014$ ($s \ge 3$), the acceptance is the probability of strong evidence; and if
$\Delta = 0.0228$ ($s \ge 2$), the acceptance is the probability of weak evidence. The case of weak evidence for
50% acceptance ($s = 2$) is shown in Fig. 1. The $5\sigma$ discovery, $3\sigma$ strong evidence, and $2\sigma$ weak evidence
curves for 90% acceptance are presented in Fig. 2.
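The system (3)-(4) is easy to explore numerically. The following is a minimal sketch, not the authors' program: the function names (`poisson_sf`, `critical_value`, `required_signal`) and the bisection search are illustrative assumptions. It finds the smallest critical value $n_0$ with $\beta \le \Delta$ and then the signal $n_s$ at which the acceptance $1 - \alpha$ reaches a requested value, here for the setting of Fig. 1 ($n_b = 1000$, $\Delta = 0.02275$, acceptance 0.5).

```python
import math

def poisson_pmf(n, lam):
    """Poisson probability f(n; lam) of eq. (1), evaluated in log space."""
    if lam == 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))

def poisson_sf(n0, lam):
    """Upper tail P(n > n0): the sum of f(n; lam) from n0 + 1 to infinity."""
    total, k = 0.0, n0 + 1
    while True:
        term = poisson_pmf(k, lam)
        total += term
        # Beyond the mode the terms only decrease; stop once they are negligible.
        if k > lam and term < 1e-17 * max(total, 1e-300):
            return total
        k += 1

def critical_value(n_b, delta):
    """Smallest n0 >= 0 with beta = P(n > n0 | n_b) <= delta, eq. (3)."""
    n0 = int(n_b)
    while n0 > 0 and poisson_sf(n0 - 1, n_b) <= delta:
        n0 -= 1
    while poisson_sf(n0, n_b) > delta:
        n0 += 1
    return n0

def required_signal(n_b, delta, acceptance):
    """Return (n0, n_s) such that 1 - alpha of eq. (4) equals the acceptance."""
    n0 = critical_value(n_b, delta)
    lo, hi = 0.0, 1.0
    while poisson_sf(n0, n_b + hi) < acceptance:
        hi *= 2.0                      # bracket the solution from above
    for _ in range(60):                # bisection; the tail grows with n_s
        mid = 0.5 * (lo + hi)
        if poisson_sf(n0, n_b + mid) < acceptance:
            lo = mid
        else:
            hi = mid
    return n0, hi

n0, n_s = required_signal(1000.0, 0.02275, 0.5)
print("critical value n0 =", n0, " required signal n_s =", round(n_s, 1))
```

Repeating the call over a grid of $n_b$ values traces out curves of the kind shown in Fig. 2.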



3. Effects of one sided systematic errors on the discovery potential 

We consider here forthcoming experiments to search for new physics. In this case we must take into
account a systematic uncertainty that has a theoretical origin and no statistical properties. For
example, two-loop corrections for most reactions are at present not known. In principle, it is a
"reproducible inaccuracy introduced by faulty technique" [10] and, according to [11], it carries the sense of
"incompetence". If the predicted number of background events strongly exceeds the predicted number of
signal events, the discovery potential is sensitive to this uncertainty. In this case we can only estimate
the scale of the influence of the background uncertainty on the observability of the signal, i.e. we can
indicate the admissible level of uncertainty in theoretical calculations for a given experiment proposal.

Suppose the uncertainty in the calculation of the exact background cross section is determined by a
parameter $\delta$, i.e. the exact cross section lies in the interval $(\sigma_b, \sigma_b(1 + \delta))$ and the exact value of the average
number of background events lies in the interval $(n_b, n_b(1 + \delta))$. Let us suppose $n_b \gg n_s$. As we know
nothing about the possible values of the average number of background events, we consider the worst case [^.
Taking into account formulae (3) and (4) we have

$$\beta = \sum_{n=n_0+1}^{\infty} f(n; n_b(1 + \delta)) \le \Delta \qquad (5)$$

$$1 - \alpha = \sum_{n=n_0+1}^{\infty} f(n; n_b + n_s). \qquad (6)$$

Formulae (5) and (6) realize the worst case: the critical value $n_0$ is fixed by the maximal background cross
section $\sigma_b(1 + \delta)$, while the acceptance is evaluated as if both the signal and the background cross sections
were minimal.

An example of the use of these formulae is shown in Fig. 3. We see that a sample of 200 events (of which
100 are expected background) will be enough to reach about 90% probability of discovery with a 25%
systematic uncertainty in the theoretical estimate of the background.
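The worst-case prescription of formulae (5) and (6) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the helper names are hypothetical, and the critical value is found by a simple upward search. The critical value is computed from the inflated background $n_b(1 + \delta)$, while the acceptance is computed from $n_s + n_b$.

```python
import math

def poisson_pmf(n, lam):
    """Poisson probability f(n; lam), evaluated in log space."""
    if lam == 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))

def poisson_sf(n0, lam):
    """Upper tail P(n > n0), summed directly to keep tiny tails accurate."""
    total, k = 0.0, n0 + 1
    while True:
        term = poisson_pmf(k, lam)
        total += term
        if k > lam and term < 1e-17 * max(total, 1e-300):
            return total
        k += 1

def critical_value(lam, level):
    """Smallest n0 >= 0 with P(n > n0 | lam) <= level."""
    n0 = int(lam)
    while n0 > 0 and poisson_sf(n0 - 1, lam) <= level:
        n0 -= 1
    while poisson_sf(n0, lam) > level:
        n0 += 1
    return n0

def discovery_probability(n_s, n_b, sys_delta, level=2.85e-7):
    """1 - alpha of eq. (6), with n0 fixed by the inflated background of eq. (5)."""
    n0 = critical_value(n_b * (1.0 + sys_delta), level)
    return poisson_sf(n0, n_s + n_b)

for sys_delta in (0.0, 0.10, 0.25):
    p = discovery_probability(100.0, 100.0, sys_delta)
    print(f"delta = {sys_delta:.2f}: discovery probability = {p:.4f}")
```

The printed values can be compared with the curves of Fig. 3; the discovery probability falls as $\delta$ grows because the critical value $n_0$ moves upward.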



4. An account of statistical uncertainty in the determination of $n_s$ and $n_b$

Usually, an experimentalist would extract the numbers $n_s$ and $n_b$ from a Monte Carlo simulation of the
planned experiment, which results in statistical errors. If the probability of the true value of the parameter
of the Poisson distribution (the conditional probability) to be equal to any value of $\lambda > 0$ in the case when one




Fig. 1: The case $n_b \gg 1$. Poisson distributions with parameters $\lambda = 1000$ (left) and $\lambda = 1064$ (right). Here $1 - \alpha = 0.5$ and
$\beta = 0.02275$ (i.e. $s = 2$).
Fig. 2: Dependences of $n_s$ on $n_b$ for $1 - \alpha = 0.9$ and for different values of $\beta$.




Fig. 3: Discovery probability versus $n_s$ for different values of the systematic uncertainty $\delta$ for the case $n_s = n_b$. The curves are
constructed under the condition $\beta = 2.85 \cdot 10^{-7}$.
observation $n_b = \hat{n}$ or $n_s + n_b = \hat{n}$ is known, we have to take into account the statistical uncertainties in
the determination of these values.

Let us write down the density of the Gamma distribution $\Gamma_{a,n+1}$ as

$$g_n(a, \lambda) = \frac{a^{n+1}}{\Gamma(n+1)} e^{-a\lambda} \lambda^n, \qquad (7)$$

where $a$ is a scale parameter, $n + 1 > 0$ is a shape parameter, $\lambda > 0$ is a random variable, and $\Gamma(n + 1) = n!$
is the Gamma function.

Let us set $a = 1$; then for each $n$ the continuous function

$$g_n(\lambda) = \frac{\lambda^n}{n!} e^{-\lambda}, \qquad \lambda > 0, \; n > -1 \qquad (8)$$

is the density of the Gamma distribution $\Gamma_{1,n+1}$ with the scale parameter $a = 1$ (see Fig. 4). The mean,
mode, and variance of this distribution are given by $n + 1$, $n$, and $n + 1$, respectively.

As follows from the article [12] and is clearly seen from the identity [[ij] (Fig. 5)

$$\sum_{n=\hat{n}+1}^{\infty} f(n;\lambda_1) + \int_{\lambda_1}^{\lambda_2} g_{\hat{n}}(\lambda)\,d\lambda + \sum_{n=0}^{\hat{n}} f(n;\lambda_2) = 1,$$

i.e.

$$\sum_{n=\hat{n}+1}^{\infty} \frac{\lambda_1^n e^{-\lambda_1}}{n!} + \int_{\lambda_1}^{\lambda_2} \frac{\lambda^{\hat{n}} e^{-\lambda}}{\hat{n}!}\,d\lambda + \sum_{n=0}^{\hat{n}} \frac{\lambda_2^n e^{-\lambda_2}}{n!} = 1 \qquad (9)$$

for any $\lambda_1 > 0$ and $\lambda_2 > 0$, the probability of the true value of the parameter of the Poisson distribution to be equal
to the value $\lambda$ in the case of one observation $\hat{n}$ has the probability density of the Gamma distribution $\Gamma_{1,1+\hat{n}}$.
Equation (9) shows that we can mix Bayesian and frequentist probabilities in the given approach.
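Identity (9) can be checked numerically. The sketch below is an illustration only: the helper names and the use of a composite Simpson rule for the Gamma integral are assumptions made here. It evaluates the three terms for the values quoted in the caption of Fig. 5 ($\hat{n} = 4$, $\lambda_1 = 1.51$, $\lambda_2 = 8.36$) and prints their sum.

```python
import math

def poisson_pmf(n, lam):
    """Poisson probability f(n; lam); as a function of lam it is also g_n(lam) of eq. (8)."""
    if lam == 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))

def simpson(func, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def identity_terms(n_hat, lam1, lam2):
    """The three terms on the left-hand side of eq. (9)."""
    # Sum over n > n_hat for lam1, written as one minus the finite lower sum.
    upper_tail = 1.0 - sum(poisson_pmf(n, lam1) for n in range(n_hat + 1))
    # Gamma density with shape n_hat + 1 integrated between lam1 and lam2.
    middle = simpson(lambda x: poisson_pmf(n_hat, x), lam1, lam2)
    # Sum over n <= n_hat for lam2.
    lower_tail = sum(poisson_pmf(n, lam2) for n in range(n_hat + 1))
    return upper_tail, middle, lower_tail

terms = identity_terms(4, 1.51, 8.36)
print("terms:", terms, " sum:", sum(terms))
```

The middle term is the probability content assigned to the interval $(\lambda_1, \lambda_2)$; the two Poisson sums are the corresponding tail probabilities.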
Fig. 4: The behaviour of the probability density of the true value of the parameter $\lambda$ for the Poisson distribution in the case of $n$
observed events versus $\lambda$ and $n$. Here $f(n;\lambda) = g_n(\lambda) = \frac{\lambda^n}{n!} e^{-\lambda}$ is both the Poisson distribution with the parameter $\lambda$ along
the axis $n$ and the Gamma distribution with a shape parameter $n + 1$ and a scale parameter 1 along the axis $\lambda$.




Fig. 5: The Poisson distributions $f(n, \lambda)$ for the values of $\lambda$ determined by the confidence limits $\lambda_1 = 1.51$ and $\lambda_2 = 8.36$ in the case of the
observed number of events $\hat{n} = 4$ are shown. The probability density of the Gamma distribution with a scale parameter $a = 1$ and
a shape parameter $n + 1 = \hat{n} + 1 = 5$ is shown within this confidence interval.



It allows us to transform the probability distributions $f(n; n_s + n_b)$ and $f(n; n_b)$ accordingly and to
calculate the probability of discovery [14]

$$1 - \alpha = 1 - \int_0^{\infty} g_{n_s+n_b}(\lambda) \sum_{n=0}^{n_0} f(n;\lambda)\,d\lambda = 1 - \sum_{n=0}^{n_0} \frac{C^{n}_{n_s+n_b+n}}{2^{\,n_s+n_b+n+1}}, \qquad (10)$$

where the critical value $n_0$ for the future hypotheses testing about the observability is chosen so that
the Type II error

$$\beta = \int_0^{\infty} g_{n_b}(\lambda) \sum_{n=n_0+1}^{\infty} f(n;\lambda)\,d\lambda = \sum_{n=n_0+1}^{\infty} \frac{C^{n}_{n_b+n}}{2^{\,n_b+n+1}} \qquad (11)$$

is less than or equal to $2.85 \cdot 10^{-7}$. Here $C^n_N$ is $\frac{N!}{n!\,(N-n)!}$. We also suppose that the Monte Carlo
luminosity is exactly the same as the data luminosity later in the experiment. The behaviour of the discovery
probability with and without account for this uncertainty is shown in Fig. 6.
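The closed-form sums in eqs. (10) and (11) rest on the integral of $g_N(\lambda) f(n;\lambda)$ over $\lambda$, which equals $C^n_{N+n} / 2^{N+n+1}$. The sketch below is an illustration with hypothetical helper names and an assumed Simpson integration; it compares the closed form with a numerical integral and then evaluates eqs. (10) and (11) for integer $n_s$ and $n_b$.

```python
import math

def poisson_pmf(n, lam):
    """Poisson probability f(n; lam); as a function of lam it is also g_n(lam)."""
    if lam == 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))

def mixed_term(N, n):
    """Closed form C^n_{N+n} / 2^(N+n+1) used in eqs. (10) and (11)."""
    return math.comb(N + n, n) / 2 ** (N + n + 1)

def mixed_term_numeric(N, n, upper=60.0, steps=20000):
    """Simpson estimate of the integral of g_N(lam) * f(n; lam) over [0, upper]."""
    h = upper / steps
    func = lambda x: poisson_pmf(N, x) * poisson_pmf(n, x)
    total = func(0.0) + func(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(i * h)
    return total * h / 3.0

def critical_value_stat(n_b, level):
    """Smallest n0 for which beta of eq. (11) does not exceed level."""
    cdf, n0 = mixed_term(n_b, 0), 0
    while 1.0 - cdf > level:
        n0 += 1
        cdf += mixed_term(n_b, n0)
    return n0

def discovery_probability_stat(n_s, n_b, level=2.85e-7):
    """1 - alpha of eq. (10) for integer n_s and n_b."""
    n0 = critical_value_stat(n_b, level)
    return 1.0 - sum(mixed_term(n_s + n_b, n) for n in range(n0 + 1))

print("closed form:", mixed_term(5, 3), " numeric:", mixed_term_numeric(5, 3))
print("discovery probability, n_s = n_b = 100:", discovery_probability_stat(100, 100))
```

The upper integration limit of 60 is adequate only for small $N + n$, as in the check above; the closed form itself needs no such cut-off.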
Fig. 6: Discovery probability versus $n_s$ with and without account for statistical uncertainty in the determination of $n_s$ and $n_b$. The
case $n_s = n_b$. The curves are constructed under the condition $\beta = 2.85 \cdot 10^{-7}$. In the figure legend, CL denotes the integrated
luminosity of the planned experiment and MCL the integrated luminosity of the Monte Carlo data.



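Poisson distributed random values have the following property: if $\xi_i \sim Pois(\lambda_i)$, $i = 1, 2, \ldots, m$, then
$\sum_{i=1}^{m} \xi_i \sim Pois(\sum_{i=1}^{m} \lambda_i)$. It means that if we have $m$ observations $\hat{n}_1, \hat{n}_2, \ldots, \hat{n}_m$ of the same random value
$\xi \sim Pois(\lambda)$, we can consider these observations as one observation $\sum_{i=1}^{m} \hat{n}_i$ of a Poisson distributed
random value with parameter $m \cdot \lambda$. According to eq. (9) the probability of the true value of the parameter of
this Poisson distribution has the probability density of the Gamma distribution $\Gamma_{1,\,1+\sum_{i=1}^{m} \hat{n}_i}$. Using the scale
parameter $m$ one can show that the probability of the true value of the parameter of the Poisson distribution in the
case of $m$ observations of the random value $\xi \sim Pois(\lambda)$ has the probability density of the Gamma distribution
$\Gamma_{m,\,1+\sum_{i=1}^{m} \hat{n}_i}$, i.e.

$$G\Big(\sum_{i=1}^{m} \hat{n}_i, m, \lambda\Big) = g_{\sum \hat{n}_i}(m, \lambda) = \frac{m^{\,1+\sum \hat{n}_i}}{(\sum \hat{n}_i)!}\, e^{-m\lambda} \lambda^{\sum \hat{n}_i}. \qquad (12)$$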
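Both statements of this paragraph can be checked with a few lines of code. The sketch below is illustrative: the Monte Carlo counts are hypothetical, the helper names are assumptions, and the Gamma density is taken as eq. (7) with $a = m$ and $n$ equal to the total count. It verifies the additivity of Poisson variables by direct convolution and confirms that the density is normalized with mean $(1 + \sum \hat{n}_i)/m$.

```python
import math

def poisson_pmf(n, lam):
    """Poisson probability f(n; lam), evaluated in log space."""
    if lam == 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))

def convolved_pmf(n, lam1, lam2):
    """P(xi_1 + xi_2 = n) for independent Poisson variables, by direct convolution."""
    return sum(poisson_pmf(k, lam1) * poisson_pmf(n - k, lam2) for k in range(n + 1))

def gamma_density(total, m, lam):
    """Eq. (7) with a = m and n = total (assumed form of the m-observation density)."""
    if lam == 0.0:
        return float(m) if total == 0 else 0.0
    return math.exp((total + 1) * math.log(m) + total * math.log(lam)
                    - m * lam - math.lgamma(total + 1))

def simpson(func, a, b, steps=20000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    acc = func(a) + func(b)
    for i in range(1, steps):
        acc += (4 if i % 2 else 2) * func(a + i * h)
    return acc * h / 3.0

counts = [3, 5, 4]                      # hypothetical counts, one per sub-sample
m, total = len(counts), sum(counts)
norm = simpson(lambda x: gamma_density(total, m, x), 0.0, 30.0)
mean = simpson(lambda x: x * gamma_density(total, m, x), 0.0, 30.0)
print("additivity:", convolved_pmf(6, 1.5, 2.5), poisson_pmf(6, 4.0))
print("normalization:", norm, " mean:", mean, " expected mean:", (total + 1) / m)
```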



Let us assume that the integrated luminosity of the planned experiment is $L$ and the integrated
luminosity of the Monte Carlo data is $m \cdot L$. For instance, we can divide the Monte Carlo data into $m$ parts with
luminosity corresponding to the planned experiment. The result of the Monte Carlo experiment in this case
is a set of $m$ pairs of numbers $((n_b)_i, (n_b)_i + (n_s)_i)$, where $(n_b)_i$ and $(n_s)_i$ are the numbers of
background and signal events observed in each part of the Monte Carlo data. Let us denote
$N_b = \sum_{i=1}^{m} (n_b)_i$ and $N_{s+b} = \sum_{i=1}^{m} ((n_s)_i + (n_b)_i)$. Correspondingly (see page 98, [0]),

$$\beta = \int_0^{\infty} G(N_b, m, \lambda) \sum_{n=n_0+1}^{\infty} f(n;\lambda)\,d\lambda = \sum_{n=n_0+1}^{\infty} C^{n}_{N_b+n} \frac{m^{1+N_b}}{(m+1)^{1+N_b+n}} \le \Delta, \qquad (13)$$

$$1 - \alpha = 1 - \int_0^{\infty} G(N_{s+b}, m, \lambda) \sum_{n=0}^{n_0} f(n;\lambda)\,d\lambda = 1 - \sum_{n=0}^{n_0} C^{n}_{N_{s+b}+n} \frac{m^{1+N_{s+b}}}{(m+1)^{1+N_{s+b}+n}}. \qquad (14)$$

As a result, we have a generalized system of equations for the case of different luminosities in the planned
data and the Monte Carlo data. The set of values $C^{n}_{N+n} \frac{m^{1+N}}{(m+1)^{1+N+n}}$, $n = 0, 1, \ldots$ is a negative
binomial (Pascal) distribution with real parameters $N + 1$ and $\frac{1}{m+1}$, mean value $\frac{1+N}{m}$ and variance
$\frac{(1+m)(1+N)}{m^2}$.
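The generalized system (13)-(14) can be sketched in code as follows. This is an illustration under stated assumptions, not the authors' program: the function names are hypothetical, the summand is evaluated through `math.lgamma` so that large counts do not overflow, and the example Monte Carlo totals are chosen to correspond to $n_s = n_b = 100$ in the planned experiment.

```python
import math

def nb_pmf(N, m, n):
    """Summand C^n_{N+n} m^(1+N) / (m+1)^(1+N+n) of eqs. (13) and (14)."""
    log_p = (math.lgamma(N + n + 1) - math.lgamma(N + 1) - math.lgamma(n + 1)
             + (N + 1) * math.log(m) - (N + n + 1) * math.log(m + 1))
    return math.exp(log_p)

def critical_value(N_b, m, level):
    """Smallest n0 for which beta of eq. (13) does not exceed level."""
    cdf, n0 = nb_pmf(N_b, m, 0), 0
    while 1.0 - cdf > level:
        n0 += 1
        cdf += nb_pmf(N_b, m, n0)
    return n0

def discovery_probability(N_sb, N_b, m, level=2.85e-7):
    """1 - alpha of eq. (14) for Monte Carlo totals N_sb and N_b."""
    n0 = critical_value(N_b, m, level)
    return 1.0 - sum(nb_pmf(N_sb, m, n) for n in range(n0 + 1))

# Moments of the negative binomial distribution for N = 7, m = 3.
N, m = 7, 3
mean = sum(n * nb_pmf(N, m, n) for n in range(500))
var = sum(n * n * nb_pmf(N, m, n) for n in range(500)) - mean ** 2
print("mean:", mean, " expected:", (1 + N) / m)
print("variance:", var, " expected:", (1 + m) * (1 + N) / m ** 2)

# Same planned experiment, Monte Carlo luminosity equal to L and to 10 L.
p_m1 = discovery_probability(200, 100, 1)
p_m10 = discovery_probability(2000, 1000, 10)
print("discovery probability: m = 1 ->", p_m1, " m = 10 ->", p_m10)
```

A larger Monte Carlo sample narrows the Gamma density of eq. (12), so the discovery probability moves toward the value obtained without statistical uncertainty.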



5. Conclusions 

In this paper we have described a method to estimate the discovery potential for new physics in planned
experiments where only the average numbers of background $n_b$ and signal $n_s$ events are known. The
"effective significance" $s$ of a signal for a given probability of observation is discussed. We also estimate
the influence of the systematic uncertainty related to inexact knowledge of the signal and background cross
sections on the probability to discover new physics in planned experiments. Accounting for this kind
of systematics is essential in the search for supersymmetry and leads to a substantial decrease in
the probability to discover new physics in future experiments. The texts of the programs can be found at
http://home.cern.ch/bityukov . A method to account for statistical uncertainties in the determination of
the mean numbers of signal and background events is proposed. Appendix A demonstrates the interrelation
between Gamma and Poisson distributions. The approach to the estimation of exclusion limits on new
physics is described in Appendix B.
A The interrelation between Gamma and Poisson distributions

The identity (9) (Fig. 5)

$$\sum_{n=\hat{n}+1}^{\infty} f(n;\lambda_1) + \int_{\lambda_1}^{\lambda_2} g_{\hat{n}}(\lambda)\,d\lambda + \sum_{n=0}^{\hat{n}} f(n;\lambda_2) = 1$$

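can be easily generalized, for example, to

$$\sum_{k=k_m+1}^{\infty} f(k;\lambda_1) + \sum_{i=1}^{m} \left[ \int_{\lambda_i}^{\lambda_{i+1}} g_{k_{m+1-i}}(\lambda)\,d\lambda + \sum_{k=k_{m-i}+1}^{k_{m+1-i}} f(k;\lambda_{i+1}) \right] + f(k_0;\lambda_{m+1}) = 1 \qquad (15)$$

for any real $\lambda_i > 0$, $i \in [1, m+1]$, integer $m > 0$, $k_l > k_{l-1} \ge 0$, $l \in [1, m]$, $k_0 = 0$.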
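As a result of generalizations of this type we obtain

$$\int_{\lambda_1}^{\lambda_2} g_m(\lambda)\,d\lambda + \sum_{i=n+1}^{m} f(i;\lambda_2) + \int_{\lambda_2}^{\lambda_1} g_n(\lambda)\,d\lambda - \sum_{i=n+1}^{m} f(i;\lambda_1) = 0, \qquad (16)$$

i.e.

$$\int_{\lambda_1}^{\lambda_2} \frac{\lambda^m e^{-\lambda}}{m!}\,d\lambda + \sum_{i=n+1}^{m} \frac{\lambda_2^i e^{-\lambda_2}}{i!} + \int_{\lambda_2}^{\lambda_1} \frac{\lambda^n e^{-\lambda}}{n!}\,d\lambda - \sum_{i=n+1}^{m} \frac{\lambda_1^i e^{-\lambda_1}}{i!} = 0$$

for any real $\lambda_1 > 0$, $\lambda_2 > 0$, and integer $m > n \ge 0$.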
B Exclusion limits [|, |] 

It is important to know the range in which a planned experiment can exclude the presence of a signal at a given
confidence level $(1 - \epsilon)$. It means that the uncertainty in future hypotheses testing about the
non-observation of the signal will be equal to or less than $\epsilon$. In refs.|[T^, |l^] different methods to derive exclusion
limits in prospective studies have been suggested.

We propose to use the relative uncertainty

$$\kappa = \frac{\alpha + \beta}{2 - (\alpha + \beta)} \qquad (17)$$

which will arise in testing the hypothesis $H_0$ against $H_1$. It is the probability of a wrong decision. When
the equal-probability test [Q] is applied, this probability $\kappa$ is the minimal relative number
of wrong decisions in the future hypotheses testing for Poisson distributions. It is the uncertainty in the
observability of the new phenomenon. Note that in this case the probability of a correct decision $1 - \kappa$
(the relative number of correct decisions) may be considered as a distance between two distributions (a
measure of the distinguishability of two Poisson processes) in the frequentist sense. This distance ranges from
zero to unity (as a result of the definition of the equal-probability test).
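The relative uncertainty of eq. (17) can be evaluated directly from the two Poisson distributions. The sketch below is an assumption-laden illustration: the definition of the equal-probability test is given in the cited reference and is not reproduced here, so the code simply scans the critical value $n_0$ and keeps the one that minimizes $\kappa$. The function names and the scan range are hypothetical.

```python
import math

def poisson_pmf(n, lam):
    """Poisson probability f(n; lam), evaluated in log space."""
    if lam == 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))

def poisson_cdf(n0, lam):
    """P(n <= n0), the sum of f(n; lam) from 0 to n0."""
    return sum(poisson_pmf(n, lam) for n in range(n0 + 1))

def relative_uncertainty(n0, n_s, n_b):
    """kappa of eq. (17) for a given critical value n0."""
    alpha = poisson_cdf(n0, n_s + n_b)   # H0 rejected although the signal is present
    beta = 1.0 - poisson_cdf(n0, n_b)    # H0 accepted although only background is present
    return (alpha + beta) / (2.0 - (alpha + beta))

def minimal_uncertainty(n_s, n_b):
    """Scan n0 and return (n0, kappa) with the smallest kappa (a stand-in search)."""
    lam = n_s + n_b
    n_max = int(lam + 10.0 * math.sqrt(lam) + 10.0)
    return min(((n0, relative_uncertainty(n0, n_s, n_b)) for n0 in range(n_max)),
               key=lambda pair: pair[1])

best_n0, kappa = minimal_uncertainty(10.0, 10.0)
print("n0 =", best_n0, " kappa =", kappa, " 1 - kappa =", 1.0 - kappa)
```

For $n_s = 0$ the two distributions coincide, $\alpha + \beta = 1$ for every $n_0$, and the distance $1 - \kappa$ vanishes, in line with the range quoted above.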



'See, also, page 97 in ref. page 358 in ref. Jis] ] and formula A7 in ref. [[l6||. 



